July 9, 2019
Docker for Data Science: R, ShinyProxy and more
$ sudo docker images
REPOSITORY                  TAG      IMAGE ID       CREATED       SIZE
gbmperf_cpu                 latest   50e874eb88e1   3 days ago    3.41GB
postgres                    alpine   5e83e6aa7014   12 days ago   70.8MB
openanalytics/rdepot-repo   latest   bc7b067e0170   12 days ago   104MB
openanalytics/rdepot-app    latest   af2ba41e5049   12 days ago   2.82GB
openanalytics/r-base        latest   9e7a835c395e   6 weeks ago   585MB
$ sudo docker run -it openanalytics/r-base R
$ sudo docker ps
CONTAINER ID   IMAGE                  COMMAND   CREATED          STATUS          PORTS   NAMES
3a3c4f4f9d5e   openanalytics/r-base   "R"       14 seconds ago   Up 13 seconds           flamboyant_shaw
Kernel namespaces:
PID: isolate allocation of process identifiers
network: isolated network interface controllers, firewall rules, routing tables
mount: file system layout, read-only mount points etc.
user: isolation of user ids
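On a Linux host, the namespaces a process belongs to can be inspected under /proc. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper name list_namespaces is made up for this illustration and is not part of Docker.

```shell
#!/bin/sh
# Print the namespace identifiers of a process (default: the current shell).
# Assumption: Linux with /proc mounted. list_namespaces is a hypothetical helper.
list_namespaces() {
  pid="${1:-$$}"
  for ns in pid net mnt user; do
    # Each entry is a symlink whose target looks like "pid:[<inode number>]".
    printf '%s\t%s\n' "$ns" "$(readlink "/proc/$pid/ns/$ns" 2>/dev/null)"
  done
}

list_namespaces
```

Running the same function on the host and in a shell inside a container should report different identifiers for the namespaces that Docker isolates.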
View from inside, view from outside
From inside:
$ sudo docker ps
CONTAINER ID   IMAGE                  COMMAND   CREATED             STATUS             PORTS   NAMES
3a3c4f4f9d5e   openanalytics/r-base   "R"       About an hour ago   Up About an hour           flamboyant_shaw
Get into the container and run bash, then run top from that shell:
$ sudo docker exec -it 3a3c4f4f9d5e bash
top
From outside:
$ sudo docker container top 3a3c4f4f9d5e
UID    PID     PPID   C   STIME   TTY     TIME       CMD
root   9872    9848   0   09:42   pts/0   00:00:00   /usr/lib/R/bin/exec/R
root   11257   9848   0   10:51   pts/1   00:00:00   bash
Dockerfile
FROM an existing image, e.g. the official image of a certain Linux distribution
openanalytics/r-base Dockerfile
$ sudo docker build -t openanalytics/r-base .
Sending build context to Docker daemon 152.6kB
Step 1/13 : FROM ubuntu:18.04
---> 1d9c17228a9e
Step 2/13 : LABEL maintainer="Tobias Verbeke <tobias.verbeke@openanalytics.eu>"
---> Using cache
---> 8332dc56486d
Step 3/13 : RUN useradd docker && mkdir /home/docker && chown docker:docker /home/docker && addgroup docker staff
---> Using cache
---> d2fb24b21f1a
[...]
docker build command
.dockerignore to exclude files
FROM: base image to start from
LABEL: add metadata to the image (e.g. maintainer)
RUN: command to execute (as root)
ENV: environment variable to set
COPY: add files from current directory to image
CMD: specifies what command to run within the container
$ sudo docker history openanalytics/r-base
IMAGE          CREATED        CREATED BY                                        SIZE     COMMENT
9e7a835c395e   6 weeks ago    /bin/sh -c #(nop) CMD ["R"]                       0B
53556f13a4fb   6 weeks ago    /bin/sh -c apt-get update && apt-get instal…      454MB
7751c94215fd   6 weeks ago    /bin/sh -c #(nop) ENV R_BASE_VERSION=3.5.3        0B
273e01930784   6 weeks ago    /bin/sh -c apt-key adv --keyserver keyserver…     2.38kB
aed5a9f91923   6 weeks ago    /bin/sh -c echo "deb https://cloud.r-project…     64B
0ac26c2bbe68   6 weeks ago    /bin/sh -c #(nop) ENV LANG=en_US.UTF-8            0B
7f7de126a4b8   6 weeks ago    /bin/sh -c #(nop) ENV LC_ALL=en_US.UTF-8          0B
30da9ee49150   6 weeks ago    /bin/sh -c echo "en_US.UTF-8 UTF-8" >> /etc/…     1.69MB
b4b8698f2896   6 weeks ago    /bin/sh -c apt-get update && apt-get instal…      42.6MB
ff46a7c61686   6 weeks ago    /bin/sh -c #(nop) ENV DEBIAN_FRONTEND=nonin…      0B
d2fb24b21f1a   6 weeks ago    /bin/sh -c useradd docker && mkdir /home/do…      393kB
8332dc56486d   6 weeks ago    /bin/sh -c #(nop) LABEL maintainer=Tobias V…      0B
1d9c17228a9e   6 months ago   /bin/sh -c #(nop) CMD ["/bin/bash"]               0B
<missing>      6 months ago   /bin/sh -c mkdir -p /run/systemd && echo 'do…     7B
<missing>      6 months ago   /bin/sh -c rm -rf /var/lib/apt/lists/*            0B
<missing>      6 months ago   /bin/sh -c set -xe && echo '#!/bin/sh' > /…       745B
<missing>      6 months ago   /bin/sh -c #(nop) ADD file:c0f17c7189fc11b6a…     86.7MB
git clone https://github.com/szilard/GBM-perf.git
cd GBM-perf/cpu
sudo docker build --build-arg CACHE_DATE=$(date +%Y-%m-%d) -t gbmperf_cpu .
sudo docker run --rm gbmperf_cpu
Reproducibility: Java needed, R packages from GitHub
Try Ctrl-C (SIGINT) first…
$ sudo docker ps
CONTAINER ID   IMAGE         COMMAND                  CREATED              STATUS              PORTS      NAMES
23a015e33372   gbmperf_cpu   "/bin/sh -c 'cd GBM-…"   About a minute ago   Up About a minute   8787/tcp   quirky_sutherland
$ sudo docker stop 23a015e33372
23a015e33372
$ sudo docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
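Both Ctrl-C and docker stop deliver a signal that the process can catch (SIGINT and SIGTERM respectively), so it gets a chance to clean up before exiting. The sketch below illustrates that idea with plain shell processes and no Docker involved; the function name graceful_demo and the message text are made up for this example.

```shell
#!/bin/sh
# Start a child that handles SIGTERM, then terminate it politely.
# graceful_demo is a hypothetical name used only for this illustration.
graceful_demo() {
  # The child installs a TERM handler and then idles.
  sh -c 'trap "echo cleanup done; exit 0" TERM; while :; do sleep 1; done' &
  child=$!
  sleep 2                # give the child time to install its handler
  kill -TERM "$child"    # catchable signal, unlike SIGKILL
  wait "$child"          # collect the child's exit status
}

graceful_demo
```

The child prints its cleanup message before exiting. A process that receives SIGKILL instead never runs its handler.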
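sudo docker stop is friendlier than sudo docker kill: stop sends SIGTERM first and only falls back to SIGKILL after a grace period, whereas kill sends SIGKILL right away
xstatR: an Environment for Running R and XLISP-STAT in Docker Containers (talk on Thursday at useR!2019)
$ sudo docker pull hello-world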
[...]
$ sudo docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
[...]
Sample Dockerfile
FROM openanalytics/r-base
CMD ["R", "-q", "-e", "cat('Hello useR!2019')"]
build it
sudo docker build -t openanalytics/hello-user2019 .
push to the registry
$ sudo docker push openanalytics/hello-user2019
The push refers to repository [docker.io/openanalytics/hello-user2019]
d5742bf4b34d: Preparing
edf956298918: Preparing
[...]
c8dbbe73b68c: Waiting
2fb7bfc6145d: Waiting
denied: requested access to the resource is denied
$ sudo docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: openanalytics
Password:
WARNING! Your password will be stored unencrypted in /home/tverbeke/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
$ sudo docker push openanalytics/hello-user2019
The push refers to repository [docker.io/openanalytics/hello-user2019]
d5742bf4b34d: Pushed
edf956298918: Pushed
298019512dd9: Pushed
d7fd52e782b4: Pushed
55792dcdca25: Pushed
4835240d7abd: Pushed
2c77720cf318: Mounted from openanalytics/r-shiny
1f6b6c7dc482: Mounted from openanalytics/r-shiny
c8dbbe73b68c: Mounted from openanalytics/r-shiny
2fb7bfc6145d: Mounted from openanalytics/r-shiny
latest: digest: sha256:c24e2dfd4ca2cd29a7cf0da802a97f1336c1079ab94208ad94edcbe54bd5a349 size: 2408
sudo docker pull openanalytics/hello-user2019
sudo docker run openanalytics/hello-user2019
Images have names, identifiers and tags
$ sudo docker images | head -n 2
REPOSITORY                     TAG      IMAGE ID       CREATED        SIZE
openanalytics/hello-user2019   latest   682c0cc54795   11 hours ago   585MB
Look back at the build command:
sudo docker build -t openanalytics/hello-user2019 .
First finding: Docker adds the latest tag by default when no tag is given.
$ sudo docker build --help | grep 'tag '
  -t, --tag list   Name and optionally a tag in the 'name:tag' format
$ sudo docker tag openanalytics/hello-user2019 openanalytics/hello-user2019:0.0.1
$ sudo docker images | grep hello
openanalytics/hello-user2019   0.0.1    682c0cc54795   20 hours ago   585MB
openanalytics/hello-user2019   latest   682c0cc54795   20 hours ago   585MB
hello-world                    latest   fce289e99eb9   6 months ago   1.84kB
$ sudo docker run openanalytics/hello-user2019:0.0.1
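The 'name:tag' convention, including the implicit latest, can be mimicked in a few lines of shell. This is a simplified sketch and not how Docker itself parses references: it ignores digests (name@sha256:...), and parse_image_ref is a hypothetical helper name.

```shell
#!/bin/sh
# Split an image reference into name and tag, defaulting the tag to "latest".
# Simplification: digest references (name@sha256:...) are not handled.
# parse_image_ref is a hypothetical helper, not a Docker command.
parse_image_ref() {
  ref="$1"
  last="${ref##*/}"   # last path component, e.g. "hello-user2019:0.0.1"
  case "$last" in
    *:*) name="${ref%:*}"; tag="${last##*:}" ;;
    *)   name="$ref";      tag="latest" ;;
  esac
  printf '%s %s\n' "$name" "$tag"
}

parse_image_ref openanalytics/hello-user2019
parse_image_ref openanalytics/hello-user2019:0.0.1
```

Looking for the colon only in the last path component keeps a registry port (as in host:port/name) from being mistaken for a tag.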
r-ver: specific versions of R
rstudio: adds rstudio
tidyverse: adds tidyverse & devtools
verse: adds tex & publishing-related packages
geospatial: adds geospatial libraries
r-base: latest R release
r-devel: development version of R added as RD next to R release R
drd: lightweight version of R-devel, built less regularly
$ sudo docker pull rocker/r-ver:devel
$ cd ~/git/rdepot-demo/examples
$ sudo docker run -it -v /home/tverbeke/git/rdepot-demo/examples/oaColors_0.0.4.tar.gz:/root/oaColors_0.0.4.tar.gz rocker/r-ver:devel bash
cd /root/
R CMD check --as-cran properties_0.0-9.tar.gz
* using log directory ‘/root/properties.Rcheck’
* using R Under development (unstable) (2019-07-05 r76788)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using option ‘--as-cran’
* checking for file ‘properties/DESCRIPTION’ ... OK
[...]
sudo docker run -v /path/on/host:/path/in/container:options
ro for read-only
Windows host path forms: c:/PATH, //c/PATH, /c//PATH
In case of a SPACE in the path e.g. “Program Files”, the whole path should be in quotes.
Example:
docker run -it -v "C:/Python app/python-app":/src python-app
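The host:container:options value and the quoting rule for paths with spaces can be combined in a small helper. The following is only a sketch: volume_spec is a hypothetical function name and the paths are placeholders, and the docker run line is left commented out.

```shell
#!/bin/sh
# Compose the value passed to "docker run -v" from its parts.
# volume_spec is a hypothetical helper; the paths below are placeholders.
volume_spec() {
  host="$1"; container="$2"; options="${3:-}"
  if [ -n "$options" ]; then
    printf '%s:%s:%s\n' "$host" "$container" "$options"
  else
    printf '%s:%s\n' "$host" "$container"
  fi
}

# Quoting the expansion keeps a path with spaces as a single argument.
spec="$(volume_spec "/home/user/my data" /data ro)"
echo "$spec"
# sudo docker run -it -v "$spec" openanalytics/r-base R   # not executed here
```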
dockerd
Sending build context to Docker daemon 2.048kB
See
https://www.shinyproxy.io/getting-started/#docker-startup-options
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:// -D -H tcp://127.0.0.1:2375
$ sudo docker-compose up
Creating network "distributedutils_distributed-modeling" with the default driver
Creating artemis-server ...
Creating artemis-server ... done
Creating center2-r-session ...
Creating center1-r-session ...
Creating center1-r-session
Creating center2-r-session ... done
Attaching to artemis-server, center1-r-session, center2-r-session
artemis-server | =========================================================================
artemis-server |
artemis-server | JBoss Bootstrap Environment
center1-r-session | > source('/root/center_rsession.R', echo=TRUE)
[...]
version: '3'
services:
  artemis-server:
    image: registry.openanalytics.eu/public/artemis-server:latest
    container_name: artemis-server
    ports:
      - "8080:8080"
    networks:
      - distributed-modeling
  center1:
    image: registry.openanalytics.eu/public/center-r-session:latest
    container_name: center1-r-session
    environment:
      CENTER_ID: center1
    depends_on:
      - artemis-server
    command: R -q -e "source('/root/center_rsession.R', echo=TRUE)"
    networks:
      - distributed-modeling
[...]
rtq-docker GitHub repository
version: '3'
services:
  redis:
    image: redis
  rtq-worker:
    build: rtq-client
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis
  rtq-producer:
    build: rtq-client
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis
sudo docker-compose build
start the redis server and the worker
sudo docker-compose up redis rtq-worker
submit a task to the queue
sudo docker-compose run rtq-producer \
  R -q -e 'rtq::createTask(rtq::RedisTQ(redux::redis_config(), "demo"), list(message = "hello!"))'
Docker Compose file in a dedicated GitHub repository.
version: '3'
services:
  nginx:
    image: nginx:latest
    hostname: nginx
    restart: unless-stopped
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./rundeck:/etc/nginx/sites-enabled/rundeck:ro
    ports:
      - 80:80
    depends_on:
      - rundeck
    networks:
      - rundeck
  rundeck:
cd ~/git/rdepot-demo
sudo docker-compose up
sudo docker run -it -p 3838:3838 \
  openanalytics/shinyproxy-demo R -e "shinyproxy::run_01_hello()"
hans / password
peter / password
$ sudo docker build -t openanalytics/shinyproxy-template .
Sending build context to Docker daemon 74.24kB
Step 1/11 : FROM openanalytics/r-base
---> 9da1c5afb0b1
Step 2/11 : MAINTAINER Tobias Verbeke "tobias.verbeke@openanalytics.eu"
---> Using cache
---> b37ca7a5c47a
Step 3/11 : RUN apt-get update && apt-get install -y sudo pandoc pandoc-citeproc libcurl4-gnutls-dev libcairo2-dev libxt-dev libssl-dev libssh2-1-dev libssl1.0.0
---> Using cache
---> 5c0b1258b30e
Step 4/11 : RUN apt-get update && apt-get install -y libmpfr-dev
---> Using cache
---> 90a66fd24433
Step 5/11 : RUN R -e "install.packages(c('shiny', 'rmarkdown'), repos='https://cloud.r-project.org/')"
---> Using cache
---> df73ceafaaf9
Step 6/11 : RUN R -e "install.packages('Rmpfr', repos='https://cloud.r-project.org/')"
---> Using cache
---> 48c016875303
Step 7/11 : RUN mkdir /root/euler
---> Using cache
---> 5e8c836a28e4
Step 8/11 : COPY euler /root/euler
---> Using cache
---> 38e8f9ce4a65
Step 9/11 : COPY Rprofile.site /usr/lib/R/etc/
---> Using cache
---> 2b9ad2a93091
Step 10/11 : EXPOSE 3838
---> Using cache
---> 34726eba58bb
Step 11/11 : CMD ["R", "-e", "shiny::runApp('/root/euler')"]
---> Using cache
---> fedd4a91b9ba
Successfully built fedd4a91b9ba
Successfully tagged openanalytics/shinyproxy-template:latest
proxy:
  ...
  specs:
  ...
  - id: euler
    display-name: Euler's Number
    description: Compute Euler's number in arbitrary precision
    container-cmd: ["R", "-e", "shiny::runApp('/root/euler')"]
    container-image: openanalytics/shinyproxy-template
proxy:
  ...
  specs:
  - id: dash-demo
    display-name: Dash Demo Application
    port: 8050
    docker-cmd: ["python", "app.py"]
    docker-image: openanalytics/shinyproxy-dash-demo
  ...
Zeppelin notebooks demonstration inside ShinyProxy
shiny:
  proxy:
    [...]
    specs:
    - id: zeppelin
      display-name: Apache Zeppelin
      description: Apache Zeppelin Official Docker
      container-image: apache/zeppelin:0.8.1
      container-volumes: [ "/tmp/zeppelin/#{proxy.userId}/notebook:/zeppelin/notebook", "/tmp/zeppelin/#{proxy.userId}/logs:/zeppelin/logs", "/tmp/zeppelin/conf:/zeppelin/conf" ]
      port: 8080
See this GitHub repository
shiny:
  proxy:
    [...]
    specs:
    - id: rstudio
      container-image: openanalytics/shinyproxy-rstudio-ide-demo
      container-env:
        DISABLE_AUTH: true
        USER: "#{proxy.userId}"
      port: 8787
      container-volumes: [ "/tmp/#{proxy.userId}:/home/#{proxy.userId}" ]
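The #{proxy.userId} placeholder gives every user a separate host directory and home folder. To see what the volume string looks like for a given user, the substitution can be imitated with sed. This is only an illustration of the resulting values and not how ShinyProxy evaluates the expression; expand_user is a hypothetical helper and assumes user names without | or & characters.

```shell
#!/bin/sh
# Imitate the per-user substitution of "#{proxy.userId}" in a volume string.
# expand_user is a hypothetical helper; it is not part of ShinyProxy.
# Assumption: the user name contains no "|" or "&" (both are special to sed here).
expand_user() {
  template="$1"; user="$2"
  printf '%s\n' "$template" | sed "s|#{proxy.userId}|$user|g"
}

expand_user '/tmp/#{proxy.userId}:/home/#{proxy.userId}' hans
```

For the user hans this yields /tmp/hans:/home/hans, so each user's RStudio session sees only its own directory.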
https://github.com/openanalytics/shinyproxy-demo
https://github.com/openanalytics/shinyproxy-template
https://github.com/openanalytics/shinyproxy-config-examples
Orchestration platform for Docker, i.e. to manage services across hosts and at scale (cluster).
Become cloud-vendor independent:
and allow for (infinite) autoscaling!